Assignment Objectives
Master the fundamental concepts of point estimation and
performance metrics
Understand the theoretical foundation of the method of moments
estimator (MME)
Implement MME in R, incorporating numerical approximation
methods
Use of AI Tools
Policy on AI Tool Use: Students must adhere to the
AI tool policy specified in the course syllabus. The direct copying of
AI-generated content is strictly prohibited. All submitted work must
reflect your own understanding; where external tools are consulted,
content must be thoroughly rephrased and synthesized in your own
words.
Code Inclusion Requirement: Any code included in
your essay must be properly commented to explain the purpose and/or
expected output of key code lines. Submitting AI-generated code without
meaningful, student-added comments will not be accepted.
Log-logistic Distribution
The log-logistic distribution (also known as the Fisk distribution)
is a continuous probability distribution that is particularly useful in
contexts where data exhibit non-negative, skewed behavior and where the
hazard rate is unimodal (increases to a peak and then decreases). It has
been widely used in the areas such as survival analysis and reliability
engineering, environmental science, economics, pharmacology, finance and
risk management, etc.
For given shape parameter \(\beta\)
and scale parameter \(\alpha\), the
cumulative distribution function
\[
F(x) = \frac{1}{1+(x/\alpha)^{-\beta}}
\]
As an exercise, you can derive the density in the following form
\[
f(x) =
\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{[1+(x/\alpha)^\beta]^2}, \ \
\text{ for } \ \ x > 0.
\]
After some algebra, we can find the \(k\)th moment
\[
\mu_k = E[X^k] = \alpha^k B\left(1+\frac{k}{\beta}, 1 - \frac{k}{\beta}
\right).
\]
This assignment will focus on finding MME of parameters \(\alpha\) and \(\beta\) based on a real-world application
data set.
Question 2: Distribution of Recovery Time from A
Surgery
Time to recovery (in days) after a specific knee surgery procedure.
This follows a typical log-logistic pattern in medical
survival/recovery analysis:
8.23, 12.74, 14.83, 16.61, 18.16, 19.55, 20.80, 21.94, 23.00, 23.98, 24.89, 25.75, 26.56,
27.34, 28.08, 28.79, 29.48, 30.15, 30.81, 31.45, 32.08, 32.70, 33.31, 33.92, 34.53, 35.13,
35.73, 36.33, 36.93, 37.53, 38.14, 38.75, 39.37, 40.00, 40.64, 41.29, 41.95, 42.63, 43.33,
44.05, 44.79, 45.56, 46.36, 47.20, 48.08, 49.02, 50.03, 51.12, 52.32, 53.65
Based on the above data to perform the following analysis.
- Using method of moment estimation to estimate \(\alpha\) and \(\beta\), denoted by \(\hat{\alpha}\) and \(\hat{\beta}\), respectively.
# --------------- STORING SAMPLE DATA AND CALCULATING M1 AND M2 ------------------------------------------------
recovery_sample <- c(8.23, 12.74, 14.83, 16.61, 18.16, 19.55, 20.80, 21.94, 23.00, 23.98, 24.89, 25.75, 26.56,
27.34, 28.08, 28.79, 29.48, 30.15, 30.81, 31.45, 32.08, 32.70, 33.31, 33.92, 34.53, 35.13,
35.73, 36.33, 36.93, 37.53, 38.14, 38.75, 39.37, 40.00, 40.64, 41.29, 41.95, 42.63, 43.33,
44.05, 44.79, 45.56, 46.36, 47.20, 48.08, 49.02, 50.03, 51.12, 52.32, 53.65)
# recovery_sample: store sample recovery times as a vector
m1 <- mean(recovery_sample)
# m1: first moment of the sample
m2 <- mean(recovery_sample^2)
# m2: second moment of the sample
# --------------- CREATING BETA ESTIMATOR FUNCTION ------------------------------------------------------------
beta_estimator <- function(K, m1, m2)
# beta_estimator: function with three arguments
# B: Beta. We will solve for this later using numerical approximation.
# m1: first moment of the sample
# m2: second moment of the sample
{
# this bracket begins the definition of the function
num <- beta(1 + 2/K, 1 - 2/K)
# num: is the numerator of our equation isolating Beta
denom <- (beta(1 + 1/K, 1 - 1/K))^2
# denom: is the denominator of our equation isolating Beta
result <- num / denom - m2 / m1^2
# result: the simplified system of equations of population and sample moments where Beta is isolated
# We will set this equal to zero and solve for Beta later.
return(result)
}
# this bracket ends the definition of the function
# --------------- GRAPHING TO ESTIMATE LOCATION OF X INT -----------------------------------------------------
x_vals <- seq(5, 10, length.out = 50)
# x_vals: numeric vector of 50 equally spaced values starting at 5 and ending at 10
dframe <- data.frame(x = x_vals, y = beta_estimator(x_vals, m1, m2))
#dframe: data frame with two columns
# x: the vector of input values (x_vals)
# y: the vector of estimated Beta values
beta_estimator.plot <- ggplot(dframe, aes(x = x, y = y)) +
# plot dframe data. use x as x-axis and y as y-axis
geom_line(color = "steelblue", size = 1) +
# draw a blue line connecting all the data points
geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
# create dashed line along the x-axis
geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
# create dashed line along the y-axis
labs(title = "The curve of function beta_estimator",
x = "Beta_Estimator",
y = "Height of Function") +
# Create titles
theme(plot.title = element_text(hjust = 0.5),
# Centers the plot title
plot.margin = margin(t = 35, r = 20, b = 30, l = 30, unit = "pt"))
# Adds extra padding around the graph so titles are less cramped
ggplotly(beta_estimator.plot) # Reads the plot we created and creates an interactive widget
Insight: The X-intercept is somewhere between 5 and
7. In the next step, we will look for a solution (beta hat) along that
interval.
# --------------- USING NUMERICAL APPROXIMATION TO FIND BETA HAT ------------------------------------------------
beta_hat <- uniroot(beta_estimator, interval = c(5, 7), m1, m2)$root
alpha_hat <- m1 / beta(1 + 1/beta_hat, 1 - 1/beta_hat)
# We plug beta hat back into the equation for the first moment to find alpha hat
pander(cbind(beta_hat = beta_hat, alpha_hat = alpha_hat))# cbind: creates a two column matrix for beta_hat and alpha_hat. Titles are alpha_hat and beta_hat. pander renders a clean readable table
- Since the moment estimates \(\hat{\alpha}\) and \(\hat{\beta}\) are random, construct
bootstrap sampling distributions for each. To visualize these
distributions, plot separate bootstrap histograms for \(\hat{\alpha}\) and \(\hat{\beta}\). then, overlay a smooth
density curve on each histogram using Gaussian kernel density
estimation. Finally, describe the patterns of these density curves.
# -----------------DEFINING THE BOOTSTRAP FUNCTION -----------------------------------------------------------------------
bootstrap_moments <- function(data, B)
# bootstrap_moments: function with two arguments
# data: the random sample from the population with which you will make bootstrap samples
# B: the number of bootstrap samples you plan to create
{
# this bracket begins the definition of the function
n <- length(data)
# n: the number of observations in your random sample
t_obs_first_mom <- mean(data)
t_obs_second_mom <- mean(data^2)
# t_obs: the computed statistic for your original random sample. This only needs to be computed once.
# t_obs_first_mom represents your first sample moment and t_obs_second_mom is your second sample moment.
# this will not become part of your bootstrap distribution. This is your actual estimate from the real world.
boot_first_mom_dist <- numeric(B)
boot_second_mom_dist <- numeric(B)
# boot_dist: empty vector of length B that will eventually store all the bootstrap statistics you will compute
for (b in 1:B)
{
# everything inside of this bracket will loop B times, filling the vectors boot_first_mom_dist and boot_second_mom_dist
indices <- sample(1:n, n, replace = TRUE)
# the sample function samples with replacement from the vector 1:n (the integers 1,2,3, ... n) until sample size is n
# the result is a vector called indices which stores the position of data values we'd like in our bootstrap sample
bootstrap_sample <- data[indices]
# we index our data with the indices vector to create our bootstrap sample
boot_first_mom_dist[b] <- mean(bootstrap_sample)
# we compute the first moment of our bootstrap sample
boot_second_mom_dist[b] <- mean(bootstrap_sample^2)
# we compute the second moment of our bootstrap sample
# and then feed those values into our bootstrap distribution vectors, filling in position b
}
# everything after this bracket is outside the loop
return(list(observed_first_mom = t_obs_first_mom, observed_second_mom = t_obs_second_mom, distribution_first_mom = boot_first_mom_dist,
distribution_second_mom = boot_second_mom_dist))
# save all of our data in a list
}
# this bracket ends the definition of the function
# ----------------- CREATING OUR BOOTSTRAP ESTIMATES -----------------------------------------------------------------------
boot_moms <- bootstrap_moments(recovery_sample,5000)
# boot_moms: a list of 5000 bootstrap first moments, 5000 bootstrap second moments, as well as the first and second moments from the sample data.
beta_bootstrap_estimates <- numeric(length(boot_moms$distribution_first_mom))
alpha_bootstrap_estimates <- numeric(length(boot_moms$distribution_first_mom))
# Creating empty vectors the length of distribution_first_mom
# We will use these to store our alpha and beta estimates
for (i in 1:length(beta_bootstrap_estimates))
# everything inside of this bracket will loop 5000 times, filling the vectors beta_bootstrap_estimates and alpha_bootstrap_estimates
{
m1b <- boot_moms$distribution_first_mom[i]
# we gather the i-th first bootstrap moment
m2b <- boot_moms$distribution_second_mom[i]
# we gather the i-th second bootstrap moment
beta_bootstrap_estimates[i] <- uniroot(beta_estimator, interval = c(2, 15), m1 = m1b, m2 = m2b)$root
# we compute the the i-th beta estimate
alpha_bootstrap_estimates[i] <- m1b / beta(1 + 1/beta_bootstrap_estimates[i], 1 - 1/beta_bootstrap_estimates[i])
# we compute the the i-th alpha estimate
# and then feed those values into our bootstrap estimate vectors, filling in position i
}
Answer: The bootstrap sampling distribution of alpha
hat is approximately normal. It appears bell‑shaped, unimodal, and
symmetric around its center. In contrast, the bootstrap distribution of
beta hat is noticeably right‑skewed. This difference arises from the way
each estimator is constructed. The estimator alpha hat is a smooth,
nearly linear transformation of the sample mean. Meanwhile, beta hat is
obtained as the numerical root of a nonlinear moment equation involving
the ratio 𝑚2 / 𝑚1^ 2 . Because the second moment 𝑚2 has higher sampling
variability and is more sensitive to large observations, the sampling
distribution of beta hat develops a longer right tail.
# ----------------- GRAPHING OUR BOOTSTRAP ESTIMATE DISTRIBUTIONS -----------------------------------------------------------------------
df_alpha <- data.frame(alpha_hat = alpha_bootstrap_estimates)
# df_alpha: ggplot cannot take vectors, so we need to create a data frame to give it
df_beta <- data.frame(beta_hat = beta_bootstrap_estimates)
p_alpha <- ggplot(df_alpha, aes(x = alpha_hat)) +
geom_histogram(aes(y = ..density..), bins = 30, fill = "skyblue", color = "white", alpha = 0.7) +
geom_density(kernel = "gaussian", color = "blue", linewidth = 1.2) +
labs(
title = "Bootstrap Distribution of α̂",
x = expression(hat(alpha)),
y = "Density"
) +
theme_minimal()
p_alpha

p_beta <- ggplot(df_beta, aes(x = beta_hat)) +
geom_histogram(aes(y = ..density..), bins = 30, fill = "salmon", color = "white", alpha = 0.7) +
geom_density(kernel = "gaussian", color = "red", linewidth = 1.2) +
labs(
title = "Bootstrap Distribution of β̂",
x = expression(hat(beta)),
y = "Density"
) +
theme_minimal()
p_beta

---
title: "Assignment 4: Methods of Moment Estimation"
author: "Matthew Gray"
date: "Due: Feburary 24, 2026"
output:
  html_document: 
    toc: yes
    toc_depth: 4
    toc_float: yes
    number_sections: no
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: yes
    theme: lumen
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: yes
    fig_width: 3
    fig_height: 3
  word_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    keep_md: yes
editor_options: 
  chunk_output_type: inline
---

```{css, echo = FALSE}
#TOC::before {
  content: "Table of Contents";
  font-weight: bold;
  font-size: 1.2em;
  display: block;
  color: navy;
  margin-bottom: 10px;
}


div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 22px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 15px;
  font-weight: bold;
  font-family: system-ui;
  color: navy;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Gill Sans", sans-serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";

body { background-color:white; }

.highlightme { background-color:yellow; }

p { background-color:white; }

}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("pander")) {
   install.packages("pander")
   library(pander)
}
if (!require("ggplot2")) {
  install.packages("ggplot2")
  library(ggplot2)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}

if (!require("plotly")) {
  install.packages("plotly")
  library(plotly)
}
####
knitr::opts_chunk$set(echo = TRUE,       # include code chunk in the output file
                      warning = FALSE,   # sometimes, you code may produce warning messages,
                                         # you can choose to include the warning messages in
                                         # the output file. 
                      results = TRUE,    # you can also decide whether to include the output
                                         # in the output file.
                      message = FALSE,
                      comment = NA
                      )  
```
 
 \
 
## **Assignment Objectives** 

* Master the fundamental concepts of point estimation and performance metrics

* Understand the theoretical foundation of the method of moments estimator (MME)

* Implement MME in R, incorporating numerical approximation methods

\

**Use of AI Tools**

**Policy on AI Tool Use**: Students must adhere to the AI tool policy specified in the course syllabus. The direct copying of AI-generated content is strictly prohibited. All submitted work must reflect your own understanding; where external tools are consulted, content must be thoroughly rephrased and synthesized in your own words.

**Code Inclusion Requirement**: Any code included in your essay must be properly commented to explain the purpose and/or expected output of key code lines. Submitting AI-generated code without meaningful, student-added comments will not be accepted.

\

**Log-logistic Distribution**

The log-logistic distribution (also known as the Fisk distribution) is a continuous probability distribution that is particularly useful in contexts where data exhibit non-negative, skewed behavior and where the hazard rate is unimodal (increases to a peak and then decreases). It has been widely used in the areas such as survival analysis and reliability engineering, environmental science, economics, pharmacology, finance and risk management, etc. 

For given shape parameter $\beta$ and scale parameter $\alpha$, the cumulative distribution function

$$
F(x) = \frac{1}{1+(x/\alpha)^{-\beta}}
$$

As an exercise, you can derive the density in the following form

$$
f(x) = \frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{[1+(x/\alpha)^\beta]^2}, \ \ \text{ for } \ \ x > 0.
$$

After some algebra, we can find the $k$th moment

$$
\mu_k = E[X^k] = \alpha^k B\left(1+\frac{k}{\beta}, 1 - \frac{k}{\beta} \right).
$$

This assignment will focus on finding MME of parameters $\alpha$ and $\beta$ based on a real-world application data set.


\

## **Question 1: Derive the log-logistic density function **

Given the CDF of the two-parameter log-logistic distribution

$$
F(x) = \frac{1}{1+(x/\alpha)^{-\beta}}.
$$

\

```{r}
# ------------- COMPUTE THE DERIVATIVE USING R ----------------
CDF_Fisk <- expression(1 / (1 + (x/alpha)^(-beta)))
# Store the CDF as an expression
PDF_Fisk <- D(CDF_Fisk, "x")
# Compute the derivative of the expression with respect to x
```
Rewrite the CDF as
\[
F(x) = \left( 1 + (x/\alpha)^{-\beta} \right)^{-1}
\]

\

Apply the chain rule:
\[
f(x)
= - \left( 1 + (x/\alpha)^{-\beta} \right)^{-2}
  \cdot -\beta (x/\alpha)^{-\beta - 1} \cdot \frac{1}{\alpha}
\]

\

Simplify:
\[
f(x)
= \left( 1 + (x/\alpha)^{-\beta} \right)^{-2}
  \cdot \beta/\alpha (x/\alpha)^{-\beta - 1}
\]

\

Re-write the first term:
\[
f(x)
= \left( \frac{(x/\alpha)^\beta + 1}{(x/\alpha)^\beta} \right)^{-2}
  \cdot \frac{\beta}{\alpha} (x/\alpha)^{-\beta - 1}
\]

\

Break apart the first term: 
\[
f(x)
= \left( (x/\alpha)^\beta + 1 \right)^{-2}
  \cdot (x/\alpha)^{2\beta}
  \cdot \frac{\beta}{\alpha} (x/\alpha)^{-\beta - 1}
\]

\

Combine terms: 
\[
f(x)
= \left( (x/\alpha)^\beta + 1 \right)^{-2}
  \cdot \frac{\beta}{\alpha} (x/\alpha)^{\beta - 1}
\]

\

Rearrange:
\[
f(x)
= \frac{(\beta/\alpha)(x/\alpha)^{\beta - 1}}
       {\left[ 1 + (x/\alpha)^{\beta} \right]^2},
\qquad x > 0
\]

\


## **Question 2: Distribution of Recovery Time from A Surgery**

Time to recovery (in days) after a specific knee surgery procedure. This follows a typical **log-logistic pattern** in medical survival/recovery analysis:

```
8.23, 12.74, 14.83, 16.61, 18.16, 19.55, 20.80, 21.94, 23.00, 23.98, 24.89, 25.75, 26.56, 
27.34, 28.08, 28.79, 29.48, 30.15, 30.81, 31.45, 32.08, 32.70, 33.31, 33.92, 34.53, 35.13, 
35.73, 36.33, 36.93, 37.53, 38.14, 38.75, 39.37, 40.00, 40.64, 41.29, 41.95, 42.63, 43.33, 
44.05, 44.79, 45.56, 46.36, 47.20, 48.08, 49.02, 50.03, 51.12, 52.32, 53.65
```
Based on the above data to perform the following analysis.

a) Using method of moment estimation to estimate $\alpha$ and $\beta$, denoted by $\hat{\alpha}$ and $\hat{\beta}$, respectively.

``` {r}
# --------------- STORING SAMPLE DATA AND CALCULATING M1 AND M2 ------------------------------------------------
recovery_sample <- c(8.23, 12.74, 14.83, 16.61, 18.16, 19.55, 20.80, 21.94, 23.00, 23.98, 24.89, 25.75, 26.56, 
27.34, 28.08, 28.79, 29.48, 30.15, 30.81, 31.45, 32.08, 32.70, 33.31, 33.92, 34.53, 35.13, 
35.73, 36.33, 36.93, 37.53, 38.14, 38.75, 39.37, 40.00, 40.64, 41.29, 41.95, 42.63, 43.33, 
44.05, 44.79, 45.56, 46.36, 47.20, 48.08, 49.02, 50.03, 51.12, 52.32, 53.65)
# recovery_sample: store sample recovery times as a vector

m1 <- mean(recovery_sample)
# m1: first moment of the sample
m2 <- mean(recovery_sample^2)
# m2: second moment of the sample
```

``` {r}
# --------------- CREATING BETA ESTIMATOR FUNCTION ------------------------------------------------------------
beta_estimator <- function(K, m1, m2) 
# beta_estimator: function with three arguments
# B: Beta. We will solve for this later using numerical approximation.
# m1: first moment of the sample
# m2: second moment of the sample
{
# this bracket begins the definition of the function
num <- beta(1 + 2/K, 1 - 2/K)
# num: is the numerator of our equation isolating Beta
denom <- (beta(1 + 1/K, 1 - 1/K))^2
# denom: is the denominator of our equation isolating Beta
result <- num / denom - m2 / m1^2
# result: the simplified system of equations of population and sample moments where Beta is isolated
# We will set this equal to zero and solve for Beta later. 
return(result)
}
# this bracket ends the definition of the function
```

``` {r}
# --------------- GRAPHING TO ESTIMATE LOCATION OF X INT  -----------------------------------------------------
x_vals <- seq(5, 10, length.out = 50)
# x_vals: numeric vector of 50 equally spaced values starting at 5 and ending at 10
dframe <- data.frame(x = x_vals, y = beta_estimator(x_vals, m1, m2))
#dframe: data frame with two columns
# x: the vector of input values (x_vals)
# y: the vector of estimated Beta values

beta_estimator.plot <- ggplot(dframe, aes(x = x, y = y)) +
# plot dframe data. use x as x-axis and y as y-axis
geom_line(color = "steelblue", size = 1) +
# draw a blue line connecting all the data points
geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
# create dashed line along the x-axis
geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
# create dashed line along the y-axis
labs(title = "The curve of function beta_estimator",
x = "Beta_Estimator", 
y = "Height of Function") +
# Create titles
theme(plot.title = element_text(hjust = 0.5),
# Centers the plot title
plot.margin = margin(t = 35, r = 20, b = 30, l = 30, unit = "pt"))
# Adds extra padding around the graph so titles are less cramped
ggplotly(beta_estimator.plot) # Reads the plot we created and creates an interactive widget
```

**Insight**:
The X-intercept is somewhere between 5 and 7. In the next step, we will look for a solution (beta hat) along that interval.


``` {r}
# --------------- USING NUMERICAL APPROXIMATION TO FIND BETA HAT  ------------------------------------------------
beta_hat <- uniroot(beta_estimator, interval = c(5, 7), m1, m2)$root
alpha_hat <- m1 / beta(1 + 1/beta_hat, 1 - 1/beta_hat)
# We plug beta hat back into the equation for the first moment to find alpha hat

pander(cbind(beta_hat = beta_hat, alpha_hat = alpha_hat))# cbind: creates a two column matrix for beta_hat and alpha_hat. Titles are alpha_hat and beta_hat. pander renders a clean readable table
```

b) Since the moment estimates $\hat{\alpha}$ and $\hat{\beta}$ are random, construct bootstrap sampling distributions for each. To visualize these distributions, plot separate bootstrap histograms for $\hat{\alpha}$ and $\hat{\beta}$.  then, overlay a smooth density curve on each histogram using Gaussian kernel density estimation. Finally, describe the patterns of these density curves.

```{r}
# -----------------DEFINING THE BOOTSTRAP FUNCTION -----------------------------------------------------------------------
bootstrap_moments <- function(data, B) 
# bootstrap_moments: function with two arguments
# data: the random sample from the population with which you will make bootstrap samples
# B: the number of bootstrap samples you plan to create
{
# this bracket begins the definition of the function
n <- length(data)
# n: the number of observations in your random sample
t_obs_first_mom <- mean(data)
t_obs_second_mom <- mean(data^2)
# t_obs: the computed statistic for your original random sample. This only needs to be computed once.
# t_obs_first_mom represents your first sample moment and t_obs_second_mom is your second sample moment.
# this will not become part of your bootstrap distribution. This is your actual estimate from the real world.
boot_first_mom_dist <- numeric(B)
boot_second_mom_dist <- numeric(B)
# boot_dist: empty vector of length B that will eventually store all the bootstrap statistics you will compute

for (b in 1:B) 
{
# everything inside of this bracket will loop B times, filling the vectors boot_first_mom_dist and boot_second_mom_dist
indices <- sample(1:n, n, replace = TRUE)
# the sample function samples with replacement from the vector 1:n (the integers 1,2,3, ... n) until sample size is n
# the result is a vector called indices which stores the position of data values we'd like in our bootstrap sample
bootstrap_sample <- data[indices]
# we index our data with the indices vector to create our bootstrap sample
boot_first_mom_dist[b] <- mean(bootstrap_sample)
# we compute the first moment of our bootstrap sample 
boot_second_mom_dist[b] <- mean(bootstrap_sample^2)
# we compute the second moment of our bootstrap sample 
# and then feed those values into our bootstrap distribution vectors, filling in position b
}
# everything after this bracket is outside the loop
  
return(list(observed_first_mom = t_obs_first_mom, observed_second_mom = t_obs_second_mom, distribution_first_mom = boot_first_mom_dist,
distribution_second_mom = boot_second_mom_dist))
# save all of our data in a list
}
# this bracket ends the definition of the function
```

```{r}
# ----------------- CREATING OUR BOOTSTRAP ESTIMATES -----------------------------------------------------------------------
boot_moms <- bootstrap_moments(recovery_sample,5000)
# boot_moms: a list of 5000 bootstrap first moments, 5000 bootstrap second moments, as well as the first and second moments from the sample data.

beta_bootstrap_estimates <- numeric(length(boot_moms$distribution_first_mom))
alpha_bootstrap_estimates <- numeric(length(boot_moms$distribution_first_mom))
# Creating empty vectors the length of distribution_first_mom
# We will use these to store our alpha and beta estimates

for (i in 1:length(beta_bootstrap_estimates)) 
# everything inside of this bracket will loop 5000 times, filling the vectors beta_bootstrap_estimates and alpha_bootstrap_estimates
{
m1b <- boot_moms$distribution_first_mom[i]
# we gather the i-th first bootstrap moment 
m2b <- boot_moms$distribution_second_mom[i]
# we gather the i-th second bootstrap moment 
beta_bootstrap_estimates[i] <- uniroot(beta_estimator, interval = c(2, 15), m1 = m1b, m2 = m2b)$root
# we compute the the i-th beta estimate 
alpha_bootstrap_estimates[i] <- m1b / beta(1 + 1/beta_bootstrap_estimates[i], 1 - 1/beta_bootstrap_estimates[i])
# we compute the the i-th alpha estimate 
# and then feed those values into our bootstrap estimate vectors, filling in position i
}
```

**Answer**:
The bootstrap sampling distribution of alpha hat is approximately normal. It appears bell‑shaped, unimodal, and symmetric around its center. In contrast, the bootstrap distribution of beta hat is noticeably right‑skewed. This difference arises from the way each estimator is constructed. The estimator alpha hat is a smooth, nearly linear transformation of the sample mean. Meanwhile, beta hat is obtained as the numerical root of a nonlinear moment equation involving the ratio 𝑚2 / 𝑚1^ 2 . Because the second moment 𝑚2 has higher sampling variability and is more sensitive to large observations, the sampling distribution of beta hat develops a longer right tail.

``` {r}
# ----------------- GRAPHING OUR BOOTSTRAP ESTIMATE DISTRIBUTIONS -----------------------------------------------------------------------
df_alpha <- data.frame(alpha_hat = alpha_bootstrap_estimates)
# df_alpha: ggplot cannot take vectors, so we need to create a data frame to give it
df_beta  <- data.frame(beta_hat  = beta_bootstrap_estimates)

p_alpha <- ggplot(df_alpha, aes(x = alpha_hat)) +
geom_histogram(aes(y = ..density..), bins = 30, fill = "skyblue", color = "white", alpha = 0.7) +
geom_density(kernel = "gaussian", color = "blue", linewidth = 1.2) +
labs(
title = "Bootstrap Distribution of α̂",
x = expression(hat(alpha)),
y = "Density"
) +
theme_minimal()
p_alpha

p_beta <- ggplot(df_beta, aes(x = beta_hat)) +
geom_histogram(aes(y = ..density..), bins = 30, fill = "salmon", color = "white", alpha = 0.7) +
geom_density(kernel = "gaussian", color = "red", linewidth = 1.2) +
labs(
title = "Bootstrap Distribution of β̂",
x = expression(hat(beta)),
y = "Density"
) +
theme_minimal()
p_beta
```







